INFORMATION THEORY AND PUBLIC KEY CRYPTOSYSTEMS

INTRODUCTION 

In 1949, C.E. Shannon [1] laid the foundations for a general, theoretical analysis of secrecy
systems. That paper establishes (as far as the author knows) the first published formalization of the
intuitive notion of a secrecy system, hereinafter called a cryptosystem. At the same time, Shannon
introduces the concept of an information theoretic analysis of cryptosystems to evaluate the theoretical
security of such systems.

Shannon  showed  that  the  plaintext  can  be  completely  and  unambiguously  recovered  if  and  only  if 
the  redundancy  of  the  plaintext  is  at  least  as  large  as  the  sum  of  the  noise  and  the  information  content 
of  the  key.  In  the  present  report,  the  redundancy  and  information  content  of  the  key  have  to  be  taken 
in  the  quantitative  sense  defined  in  Ref.  2.  Reference  2  emphasizes  that  this  is  a  theoretical  lower 
limit.  Where  a  cryptosystem  is  breakable  in  principle  according  to  this  condition,  it  could  still  be 
practically  secure.  This  is  possible  because  the  number  of  operations  required  to  determine  the  key 
might  be  excessively  large,  i.e.,  a  cryptosystem  might  have  good  practical  security  but  not  necessarily 
good  theoretical  security. 

Public key cryptosystems represent an extreme case in the relationship between key size and
computational complexity for the cryptoanalyst. There is no secret key information to detect
whatsoever, but instead there is a formidable computational problem. This paper applies these
information theoretical results to public key cryptosystems, in particular to the RSA
(Rivest-Shamir-Adleman) system. Information theory measures the amount of information concerning
plaintext or key that is contained in a cryptogram. This report does not consider the computational
complexity. A cryptosystem that is safe under the information theoretic analysis is safe if the
amount of computational resources available to the cryptoanalyst is assumed to be bounded.

A general treatment of the complexity theory approach to cryptology seems to be very difficult
to handle. Some relevant references and an attempt to address these problems are contained in Refs.
3 and 4. One of the main objectives of this research is to try to address this problem. More
specifically, it is hoped that with further research the present treatments of information theory can be
suitably modified so as to form a basis for the complexity problem for public key cryptosystems. This
report shows that although the current approach of information theory does give some security bounds
for classical cryptosystems, it is not suitable for public key systems.

It is interesting to note that almost 40 years have elapsed since the publication of Ref. 1 and
there has been little additional research on the information theory approach to cryptosystems [2].
Moreover, we note that treatments from both Refs. 1 and 2 lack the precision necessary to be
considered rigorous mathematical analyses. For example, let ud be the unicity distance of a given
cryptosystem. Then it is suggested that ud is a sort of threshold separating cryptograms that can be
cryptoanalyzed from those that cannot. Meyer [2] points out that this is not exactly correct.

Here, we follow the notation and terminology of Ref. 2. We assume that the reader is familiar
with the results from Ref. 2.

PUBLIC KEY CRYPTOGRAPHY

Public key cryptosystems do not require a priori distribution of keying material to the two
parties who wish to communicate; this is their main advantage. As an example of a public key scheme,
we consider the RSA system. Briefly, the method works as follows: Two large primes $p$ and $q$ are
chosen, and $n = p \cdot q$ is computed. Letting $m = \phi(n) = (p - 1)(q - 1)$, a large random number
$y$ is chosen such that $\gcd(y, m) = 1$. This step guarantees the existence of a unique integer $x$,
$0 < x < m$, such that $x \cdot y \equiv 1 \pmod{m}$. If we let $X$ and $Y$ represent the plaintext and
the corresponding ciphertext respectively, then the enciphering process consists of computing
$E_{PK}(X) = X^{y} \pmod{n}$, where $E_{PK}(X) = Y$ is such that $0 \le Y \le n - 1$. The encryption
scheme is made public by announcing $n$ and $y$. Using Meyer's terminology, let the public key
$PK = y$ and the secret key $SK = x$. To decode the ciphertext $Y = E_{PK}(X)$, one computes

$$D_{SK}(Y) \equiv Y^{SK} \pmod{n}$$

where

$$0 \le D_{SK}(Y) \le n - 1.$$

Thus,

$$D_{SK}(Y) \equiv X^{xy} \pmod{n} \equiv X \pmod{n}$$

by Euler's Theorem.
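The enciphering and deciphering steps can be illustrated with a minimal Python sketch. The primes 61 and 53 and the exponent 17 are toy values chosen only for illustration (they are not from the report and are far too small to be secure), and the function names `make_keys`, `encipher`, and `decipher` are hypothetical.

```python
from math import gcd


def make_keys(p, q, y):
    """Return (n, y, x) for primes p, q and a public exponent y (toy sizes only)."""
    n = p * q
    m = (p - 1) * (q - 1)          # m = phi(n)
    if gcd(y, m) != 1:
        raise ValueError("y must be coprime to m = (p - 1)(q - 1)")
    x = pow(y, -1, m)              # unique x with 0 < x < m and x*y = 1 (mod m); Python 3.8+
    return n, y, x


def encipher(X, y, n):
    """E_PK(X) = X^y (mod n)."""
    return pow(X, y, n)


def decipher(Y, x, n):
    """D_SK(Y) = Y^x (mod n)."""
    return pow(Y, x, n)


# Illustrative values only: p = 61, q = 53, y = 17 are assumptions for this sketch.
n, y, x = make_keys(61, 53, 17)
X = 65
Y = encipher(X, y, n)
assert decipher(Y, x, n) == X      # deciphering recovers the plaintext
```

Only `n` and `y` would be announced; recovering `x` from them requires `m`, and hence the factors of `n`, which is the computational problem the scheme relies on.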

INFORMATION  THEORY  AND  PUBLIC  KEY 

Information theory applies a numerical measure of information to a message. This measure is
usually given in terms of bits. Following Meyer, we let $X = \{x_1, x_2, \ldots, x_r\}$ denote the
message space and associate with each message a probability $P(x_i) = p_i$, where
$\sum_{i=1}^{r} p_i = 1$. The information associated with $x_i \in X$ is $-\log_2(p_i)$ bits. If each
message is equally likely, every $x_i$ has information value $\log_2 r$. For the set $X$, the average
information per message is defined to be the entropy of $X$, denoted by $H(X)$ and defined by

$$H(X) = \sum_{i=1}^{r} -(p_i) \log_2(p_i).$$

Here, $H(X)$ can be interpreted as a measure of the uncertainty over which message the sender will
select and transmit to the receiver. If each message is equally likely, there is a maximum uncertainty
concerning which message will be transmitted, and $H(X)$ assumes its maximum $H(X) = \log_2 r$. On
the other extreme, if there is no uncertainty over which message will be transmitted, $H(X) = 0$.
Thus $H(X)$ assumes values in the interval 0 to $\log_2 r$.
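The two extremes of $H(X)$ can be checked numerically. The following is a small Python sketch; the function name `entropy` and the example distributions are assumptions made for illustration, not part of the report.

```python
from math import log2


def entropy(probs):
    """H(X) = sum over i of -(p_i) log2(p_i), with 0 * log2(0) taken as 0."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(-p * log2(p) for p in probs if p > 0)


r = 8
uniform = [1 / r] * r                 # every message equally likely
certain = [1.0] + [0.0] * (r - 1)     # no uncertainty about the message
skewed = [0.5, 0.25, 0.125, 0.125]    # an intermediate case with r = 4

h_uniform = entropy(uniform)          # maximum value, log2(r)
h_certain = entropy(certain)          # minimum value, 0
h_skewed = entropy(skewed)            # lies between 0 and log2(4)
```

For the skewed distribution the entropy falls strictly between the two extremes, as the interval stated above requires.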

If we let $K$ denote a set of keys each having an associated probability of occurrence, we can
then define $H(K)$ in the same manner as $H(X)$ above. Thus in cryptoanalysis, $H(X)$ and $H(K)$ can
be interpreted as the analyst's prior information regarding which message and key are selected for
encipherment.

Next, we list some information measures used in this study. Here $U$, $V$, and $W$ are finite sets
whose elements have been assigned probabilities such that

$$\sum_{U} P(u) = \sum_{V} P(v) = \sum_{W} P(w) = 1.$$

(Here $\sum_{U}$ means the summation is over all $u \in U$.)

1. Conditional entropy of $U$ given $v \in V$:

$$H(U \mid v) = -\sum_{U} P(u \mid v) \log_2 P(u \mid v)$$

2. Equivocation of $U$ given $V$:

$$H(U \mid V) = \sum_{V} P(v) H(U \mid v)$$

3. Entropy of $U$ and $V$:

$$H(U, V) = -\sum_{U,V} P(u, v) \log_2 P(u, v)$$

4. (a) Equivocation of $U$ given $V$ and $W$:

$$H(U \mid V, W) = -\sum_{U,V,W} P(u, v, w) \log_2 P(u \mid v, w)$$

(b) Equivocation of $U$ and $V$ given $W$:

$$H(U, V \mid W) = -\sum_{U,V,W} P(u, v, w) \log_2 P(u, v \mid w)$$

5. Entropy of $U$, $V$, $W$:

$$H(U, V, W) = -\sum_{U,V,W} P(u, v, w) \log_2 P(u, v, w).$$


We also need the identity

$$(*) \quad H(U \mid V, W) + H(V \mid W) = H(V \mid U, W) + H(U \mid W).$$

To prove (*), let us first show

$$(*') \quad H(U, V) = H(U \mid V) + H(V).$$

By definition (3 above),

$$H(U, V) = -\sum_{U,V} P(u, v) \log_2 P(u, v)$$

$$= -\sum_{U,V} P(u \mid v) P(v) \left[ \log_2 P(u \mid v) + \log_2 P(v) \right]$$

$$= \sum_{V} P(v) H(U \mid v) - \sum_{V} P(v) \log_2 P(v)$$

$$= H(U \mid V) + H(V)$$

by 2 above. This proves (*').
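As a numerical sanity check of (*'), the following Python sketch evaluates definitions 1 to 3 on a small joint distribution. The distribution values and the helper names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from math import log2

# Hypothetical joint distribution P(u, v) over U = {"a", "b"} and V = {0, 1}.
joint = {("a", 0): 0.4, ("a", 1): 0.1, ("b", 0): 0.2, ("b", 1): 0.3}


def marginal_v(joint):
    """P(v), obtained by summing P(u, v) over all u."""
    pv = defaultdict(float)
    for (_, v), p in joint.items():
        pv[v] += p
    return dict(pv)


def entropy(dist):
    """Entropy of a distribution given as {outcome: probability}."""
    return sum(-p * log2(p) for p in dist.values() if p > 0)


def cond_entropy_given_v(joint, v):
    """Definition 1: H(U | v) = -sum over u of P(u | v) log2 P(u | v)."""
    pv = marginal_v(joint)[v]
    return sum(-(p / pv) * log2(p / pv)
               for (_, vv), p in joint.items() if vv == v and p > 0)


def equivocation(joint):
    """Definition 2: H(U | V) = sum over v of P(v) H(U | v)."""
    return sum(pv * cond_entropy_given_v(joint, v)
               for v, pv in marginal_v(joint).items())


h_uv = entropy(joint)                  # definition 3
h_v = entropy(marginal_v(joint))
h_u_given_v = equivocation(joint)
assert abs(h_uv - (h_u_given_v + h_v)) < 1e-9   # (*') holds for this example
```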

Next we claim

$$(*'') \quad H(U, V, W) = H(U \mid V, W) + H(V, W).$$

Proceeding in a manner similar to the proof of (*'), by definition (5 above)

$$H(U, V, W) = -\sum_{U,V,W} P(u, v, w) \log_2 P(u, v, w)$$

$$= -\sum_{U,V,W} P(u, v, w) \log_2 \left( P(u \mid v, w) P(v, w) \right)$$

$$= -\sum_{U,V,W} P(u, v, w) \log_2 P(u \mid v, w) - \sum_{V,W} P(v, w) \log_2 P(v, w)$$

$$= H(U \mid V, W) + H(V, W)$$

by 4(a) and 3 above. This proves (*'').

Finally, we may now verify (*).

$$H(U \mid V, W) + H(V \mid W) = H(U, V, W) - H(V, W) + H(V \mid W)$$

by (*''). But $H(U, V, W)$ clearly equals $H(V, W, U)$, so we get

$$H(U \mid V, W) + H(V \mid W) = H(V, W, U) - H(V, W) + H(V \mid W).$$

By (*'') applied to $H(V, W, U)$ we get

$$H(V, W, U) = H(V \mid W, U) + H(W, U).$$

Thus

$$H(U \mid V, W) + H(V \mid W) = H(V \mid W, U) + H(W, U) - H(V, W) + H(V \mid W).$$

But (*') gives $H(V, W) = H(V \mid W) + H(W)$, so

$$H(U \mid V, W) + H(V \mid W) = H(V \mid W, U) + H(W, U) - H(W)$$

$$= H(V \mid W, U) + H(U \mid W)$$

by (*') again. This verifies (*).
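Identity (*) can also be checked numerically on an arbitrary three-variable distribution. The sketch below is illustrative only: the randomly generated distribution, the index-tuple convention for naming variables, and the helper names `marginal` and `equivocation` are all assumptions of this example.

```python
import random
from collections import defaultdict
from itertools import product
from math import log2


def marginal(joint, idx):
    """Marginal distribution over the coordinates listed in idx."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idx)] += p
    return out


def equivocation(joint, target, given):
    """H(target | given) = -sum P(target, given) log2 P(target | given)."""
    p_both = marginal(joint, target + given)
    p_given = marginal(joint, given)
    h = 0.0
    for key, p in p_both.items():
        if p > 0:
            given_part = key[len(target):]
            h -= p * log2(p / p_given[given_part])
    return h


# Hypothetical joint distribution P(u, v, w) with |U| = 2, |V| = 3, |W| = 2.
random.seed(0)
weights = {o: random.random() for o in product(range(2), range(3), range(2))}
total = sum(weights.values())
joint = {o: w / total for o, w in weights.items()}

U, V, W = (0,), (1,), (2,)   # coordinate positions of u, v, w in each outcome
lhs = equivocation(joint, U, V + W) + equivocation(joint, V, W)
rhs = equivocation(joint, V, U + W) + equivocation(joint, U, W)
assert abs(lhs - rhs) < 1e-9  # both sides of (*) agree
```

Both sides equal $H(U, V \mid W)$, which is why the identity holds for any choice of distribution.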

To apply these information measures of theoretical secrecy to the case of public key
cryptosystems, we need a definition of unicity distance for the case when both plaintext and
corresponding ciphertext are available for analysis since this is the case in such systems. But
Meyer [2, pp. 631-632] has given such a definition.

Following Meyer, we rewrite equation (*) as

$$H(K \mid Y, X) + H(Y \mid X) = H(Y \mid K, X) + H(K \mid X)$$

where $Y$ is the cryptogram space. But since a knowledge of $k \in K$ and $x \in X$ determines
$y = E_K(x) \in Y$,

$$H(Y \mid K, X) = 0.$$

Also keys and messages are selected independently, so $H(K \mid X) = H(K)$. Thus (*) becomes

$$H(K \mid Y, X) = H(K) - H(Y \mid X).$$
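To make this relation concrete, the following Python sketch evaluates both sides for a toy classical cipher with one known plaintext-ciphertext pair. The affine cipher $y = ax + b \pmod 5$, the uniform choice of keys and messages, and all variable names are assumptions introduced for illustration; none of them appear in the report or in Ref. 2.

```python
from collections import Counter
from math import log2

Q = 5                                                     # toy alphabet size (assumption)
keys = [(a, b) for a in range(1, Q) for b in range(Q)]    # 20 equally likely keys
messages = range(Q)                                       # 5 equally likely messages

# For each pair (x, y), count the keys k with E_k(x) = y.
keys_per_pair = Counter()
for a, b in keys:
    for x in messages:
        keys_per_pair[(x, (a * x + b) % Q)] += 1

n_keys = len(keys)
h_k = log2(n_keys)                                        # H(K) for uniform keys

h_y_given_x = 0.0                                         # H(Y | X)
h_k_given_yx = 0.0                                        # H(K | Y, X)
for (x, y), count in keys_per_pair.items():
    p_xy = count / (n_keys * Q)                           # P(x, y)
    p_y_given_x = count / n_keys                          # P(y | x)
    h_y_given_x -= p_xy * log2(p_y_given_x)
    # Each of the `count` consistent keys has P(k | x, y) = 1 / count.
    h_k_given_yx += p_xy * log2(count)

assert abs(h_k_given_yx - (h_k - h_y_given_x)) < 1e-9
```

In this toy case one known pair leaves four candidate keys, so $H(K \mid Y, X)$ is 2 bits rather than 0.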

For there to be no uncertainty regarding the key that is used, we must have $H(K \mid Y, X) = 0$.
Of course in doing this, the usual interpretation is that we are dealing with a classical cryptosystem.
But here we will also allow public key systems; even though the public key, $PK$, is known, there is
still uncertainty about the secret key, $SK$. Thus the unicity distance $ud$ of a cryptosystem in which
both plaintext and ciphertext are available is defined as the value of $N$ (= cryptogram length) for
which

$$H(K) - H(Y \mid X) = 0,$$

provided such an $N$ exists.

We turn now to the special case of public key cryptosystems like the RSA system. In such
systems, given the plaintext $x \in X$, the cryptogram $y = E_K(x) \in Y$ is determined; thus
$H(Y \mid X) = 0$. So the defining equation for the $ud$ becomes (in the case of public key systems)

$$H(K) = 0.$$


This is not completely incorrect because the public key, $PK$, is known. However, it does not
consider that the secret key, $SK$, is not determined, so $H(K)$ should not be 0.

CONCLUSIONS

The preceding discussion shows that the present state of information theory is not adequate to
handle public key cryptosystems; further research should be conducted in this area. It would be
valuable to have a more mathematically precise treatment of this information theoretic approach to
cryptosystems. Neither Shannon's nor Meyer's formulation is sufficiently precise. A more precise
treatment itself may help to handle the case of public key cryptosystems.

More specifically, if we are going to adapt these methods to public key cryptosystems, we
should determine what assumptions about the set of keys $K$ are appropriate and realistic. Although
theoretically the set of all possible keys for a public key system like RSA is infinite, given one's
computing capability only a finite number can actually be used. Moreover, further assumptions about the
probabilities associated with these keys can be made. Clearly, if the key is too small, the system will
not have sufficient cryptographic strength. Thus such keys could be assigned low probabilities (or
probability 0). Also, the fact that the key for enciphering ($PK$) is known but the key for deciphering
($SK$) is unknown should be taken into account. In information theoretic terms, this says that if $K_1$
is the set of $PK$s and $K_2$ is the set of $SK$s, then $H(K_1) = 0$ but $H(K_2) \ne 0$. (It may be
that $K_1$ and $K_2$ are the same set, but they need to be distinguished for application to public key
systems.) As mentioned earlier, it is hoped that a definition of unicity distance can be formulated that
could be suitable for handling public key cryptosystems. Moreover, such a concept could serve as a
theoretical lower bound for describing the complexity of such systems.